Extracting Knowledge from the Geometric Shape of Social Network Data Using Topological Data Analysis

Authors

  • Khaled Almgren
  • Minkyu Kim
  • JeongKyu Lee

Abstract

Topological data analysis is a novel approach to extracting meaningful information from high-dimensional data, and it is robust to noise. It is based on topology, which studies the geometric shape of data. To apply topological data analysis, an algorithm called mapper is adopted. The output of mapper is a simplicial complex that represents a set of connected clusters of data points. In this paper, we explore the feasibility of topological data analysis for mining social network data by addressing the problem of image popularity. We randomly crawl images from Instagram and use mapper to analyze the effects of social context and image content on an image's popularity. Mapper clusters the images using each feature, and the ratio of popular images in each cluster is computed to identify the clusters with a high or low likelihood of popularity. The popularity of images is then predicted to evaluate the accuracy of topological data analysis. This approach is further compared with traditional clustering algorithms, including k-means and hierarchical clustering, in terms of accuracy, and the results show that topological data analysis outperforms both. Moreover, topological data analysis provides meaningful information based on the connectivity between the clusters.
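
The abstract describes mapper only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It assumes a one-dimensional filter value per data point, a cover of equal-width overlapping intervals, and a simple single-linkage clustering with a distance threshold; the names `mapper_1d`, `popularity_ratio`, `n_intervals`, `overlap`, and `eps` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def _dist(x, y):
    """Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def _cluster(points, members, eps):
    """Single-linkage clusters: connected components of the eps-neighbor graph."""
    unvisited = set(members)
    clusters = []
    while unvisited:
        stack = [unvisited.pop()]
        comp = set(stack)
        while stack:
            p = stack.pop()
            near = {q for q in unvisited if _dist(points[p], points[q]) <= eps}
            unvisited -= near
            comp |= near
            stack.extend(near)
        clusters.append(comp)
    return clusters


def mapper_1d(points, filter_values, n_intervals=4, overlap=0.25, eps=1.0):
    """Toy mapper. Returns (nodes, edges); each node is a set of point indices."""
    lo, hi = min(filter_values), max(filter_values)
    width = (hi - lo) / n_intervals
    nodes = []
    for i in range(n_intervals):
        # Stretch each interval by `overlap` of its width on both sides.
        a = lo + i * width - overlap * width
        b = lo + (i + 1) * width + overlap * width
        members = [j for j, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(_cluster(points, members, eps))
    # Two clusters are connected when they share at least one data point.
    edges = {(u, v) for u, v in combinations(range(len(nodes)), 2)
             if nodes[u] & nodes[v]}
    return nodes, edges


def popularity_ratio(node, popular):
    """Fraction of the points in a node that are labeled popular (1) vs. not (0)."""
    return sum(popular[j] for j in node) / len(node)


if __name__ == "__main__":
    pts = [(float(x),) for x in range(8)]   # illustrative data, not from the paper
    labels = [1, 1, 1, 0, 0, 0, 0, 0]       # illustrative popularity labels
    nodes, edges = mapper_1d(pts, [p[0] for p in pts], n_intervals=2)
    for k, node in enumerate(nodes):
        print(k, sorted(node), popularity_ratio(node, labels))
    print(sorted(edges))
```

In this sketch, a node with a high ratio would correspond to a cluster with a high likelihood of popularity, and the edges record which clusters overlap. The choice of filter, cover, and clustering method here is an assumption; the paper may use different ones.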

Journal:
  • Entropy

Volume 19

Publication date: 2017